This study compared a domain referenced approach with 
a traditional psychometric approach in the construction of a test. 
Results of the December, 1975 Quarterly Profile Exam (QPE) 
administered to 400 examinees at a university were the source of 
data. The 400 item QPE is a five alternative multiple choice test of 
information a "safe" physician should know. Content of the exam 
covers the broad areas of Internal Medicine, Pediatrics, 
Obstetrics/Gynecology, Surgery, and Basic Science, as well as 
additional sub-topics. For purposes of this study, two 75 item tests 
were constructed by drawing items from the 400 item QPE using two 
different strategies. The domain referenced approach was used to 
construct a 75 item test by drawing a random sample of the 400 items. 
Selection of the 75 
items with the highest point biserial item-total correlations 
represented the traditional psychometric approach to test 
construction. The exams were then rescored to obtain scores and item 
analysis data on the random and psychometric tests. Then, the two 
tests were compared with respect to distribution of p values (the 
proportion answering an item correctly), point biserial item-total 
correlations, student scores across medical school year level, and 
reliability. The results were discussed with regard to their 
consistency with expectations of the domain referenced and 
psychometric approaches. (Author/RC) 
Proponents of domain-referenced testing have emphasized 
the importance of a test accurately representing the domain 
it is intended to measure. Development of the domain-referenced 
approach occurred in the context of the movement to in- 
crease specificity of educational objectives. Considering 
the merits of defining educational objectives over informal, 
sometimes ambiguous, objectives, it was logical to conclude 
that educational tests should accurately represent the edu- 
cational objectives. In the domain-referenced approach one 
would define objectives and a corresponding domain of test 
items. Domain sampling has been described as more important 
than classic psychometric methods of test construction. The 
purpose of this study was to compare a domain-referenced ap- 
proach with a traditional psychometric approach to the con- 
struction of tests. 



Data analyzed in this study were derived from the administra- 
tion of the December, 1975 Quarterly Profile Exam (QPE) to 
400 examinees. The examinee group included 354 medical 
students, 3 physician faculty, 30 interns/residents, 8 health 
professionals, and 5 non-faculty physicians. The numbers of 
Year I-VI students were 75, 82, 64, 71, 39, and 23, 
respectively. Slide 1 illustrates these data. The students are 
enrolled in a six year, combined B.S. and M.D. program at 
the University of Missouri-Kansas City. 

The 400 item QPE is a 5 alternative multiple choice test 
of information a "safe" physician should know. The QPE is 
one product of a computer-assisted test construction system 
functioning at the University of Missouri-Kansas City. This 
system in practice uses elements of domain-referenced and 
norm-referenced approaches. Content of the exam covers the 
broad areas of Internal Medicine, Pediatrics, Obstetrics/ 
Gynecology, Surgery, and Basic Science, as well as additional 
sub-topics. For purposes of this study, two 75 item tests 
were constructed by drawing items from the 400 item QPE using 
two different strategies. A random sample of the domain of 400 
QPE items produced 75 unique items which would constitute 
the domain-referenced test. Selection of the 75 items with 
the highest point biserial item-total correlations represented 
the traditional psychometric approach to test construction. 
See slide 2. 
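
The two selection strategies can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually run at the time: it assumes the responses are held as one list of 0/1 item scores per examinee, that the item-total correlation is uncorrected (the item is counted in the total), and the function names and the seed are hypothetical.

```python
import random
from statistics import mean, pstdev


def point_biserial(item, totals):
    """Pearson correlation between a 0/1 item vector and total scores."""
    sd_item, sd_total = pstdev(item), pstdev(totals)
    if sd_item == 0 or sd_total == 0:
        return 0.0  # an item everyone passes or fails carries no information
    m_item, m_total = mean(item), mean(totals)
    cov = mean((x - m_item) * (t - m_total) for x, t in zip(item, totals))
    return cov / (sd_item * sd_total)


def build_tests(responses, k, seed=0):
    """Return (random_items, psychometric_items) as sorted lists of item indices.

    responses: one list of 0/1 item scores per examinee (assumed layout).
    """
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    # Domain-referenced test: simple random sample of k items without replacement.
    random_items = sorted(random.Random(seed).sample(range(n_items), k))
    # Psychometric test: the k items with the highest item-total correlation.
    r = [point_biserial([row[j] for row in responses], totals)
         for j in range(n_items)]
    ranked = sorted(range(n_items), key=lambda j: r[j], reverse=True)
    psychometric_items = sorted(ranked[:k])
    return random_items, psychometric_items
```

For the study described here, the call would be build_tests(responses, 75) on a 400 by 400 response matrix; each shortened test is then rescored by summing only the selected items.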



The exams were then rescored to obtain scores and item analysis 
data for the domain-referenced and psychometric tests. Then, 
the two tests were compared with respect to score and item 
characteristics. 






RESULTS AND CONCLUSIONS 



The mean and standard deviation of scores for the domain-referenced 
test were, in raw score units, 34.6 and 9.64, 
respectively; those figures for the psychometric test were 
43.3 and 17.58, respectively. Mean performance was significantly 
(t = 13.90; p < .01) higher on the psychometric test; 
furthermore, score variance was significantly (t = 29.19; 
p < .01) greater on the psychometric test. However, the 
correlation of scores for the domain-referenced and psychometric 
tests was .904. The two scores correlated to a great 
extent, but differed with regard to central tendency and 
dispersion. 

In the context in which the QPE is used, the performance across 
Years I-VI is more important than the overall mean and standard 
deviation previously discussed. The exam is intended 
to evaluate the acquisition of information through six years 
of matriculation. The mean and range of performance on the 
domain-referenced and psychometric tests are presented in 
percent correct units by Year level in slide 3. The fre- 
quency distribution of domain-referenced and psychometric 
scores is presented by Year level in slide 4. That the psy- 
chometric approach yielded scores with greater variability was 
previously noted. The frequency distributions of the two 
tests indicate that the psychometric approach separates the 
scores by year level better than the domain-referenced 
approach. Compared with the domain-referenced test, the 
psychometric test is more difficult at the lower level and 
easier at the upper level of the student body. 

Now, consider the item characteristics of the two tests 
constructed. Slide 5 presents the frequency distribution of 
p values for the domain-referenced and psychometric tests. 
The distribution of p values is positively skewed with the 
domain-referenced test; the psychometric approach yielded a 
distribution of p values more closely approximating normality. 
Slide 6 presents the frequency distribution of item-total 
correlations for the domain-referenced and psychometric tests. 
Considering the manner in which the items were chosen for the 
psychometric test, the psychometric test was expected to 
have a more restricted range of item-total correlations than 
the domain-referenced test. It should be noted that the do- 
main-referenced test contained a majority of items with sig- 
nificant item-total correlations. This, too, was expected 
since the 400 item QPE has a high level of internal consis- 
tency; the K-R formula 20 reliability coefficient was .954 
for the QPE. Both the psychometric (.959) and domain-referenced 
(.845) reliability coefficients were quite respectable. 
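
For readers who want to reproduce a coefficient of this kind, the K-R formula 20 is k/(k-1) times (1 - sum of p(1-p) over items / variance of total scores), where k is the number of items and p is the proportion answering an item correctly. The sketch below is one way to compute it, under the assumption that the variance of total scores is taken as a population variance; the paper does not say which variance convention was used, and the function name is hypothetical.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson formula 20 for a matrix of 0/1 item scores.

    responses: one list of 0/1 item scores per examinee (assumed layout).
    """
    n = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    # Assumption: population variance of the total scores.
    var_total = pvariance(totals)
    # p is the proportion answering an item correctly; p * (1 - p) is its variance.
    p = [sum(row[j] for row in responses) / n for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)
```

Applying the function to the columns selected for each 75 item test, and to all 400 items, gives three coefficients comparable in kind to those reported above.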

One final point of comparison remains to be reported. The 
question was asked, "How well is the content of the 400 item 
QPE represented by the domain-referenced and psychometric 
tests?" Slide 7 presents data which compare the content 
of these two tests with that of the total QPE. Recall that the 
QPE covers five broad areas: Internal Medicine, Pediatrics, 
Obstetrics/Gynecology, Surgery, and Basic Science. The pro- 
portional representation by content areas for the domain- 
referenced and psychometric tests did not significantly differ 
from that of the total QPE. 

In summary, the results of two approaches to test construc- 
tion differed with respect to central tendency, dispersion, 
and reliability. Also, the items of the two tests differed 
with respect to the distribution of p values and item-total 
correlations; they did not, however, differ with respect to 
the content of items compared with the total item pool. 

A couple more points should be made before concluding this 
paper. The domain referenced approach did yield a reliable 
measurement, although somewhat less reliable than the psycho- 
metric approach. This is probably due to the lengthy process 
of item generation and review involved in creating the QPE. 

A second concluding point to consider involves the question 
of how fair to the domain-referenced concept is the random 
sample approach used in this study. A better approach might 
have been to use the D% (Year VI percent correct - Year I 
percent correct) to rank and select items. Well, that approach 
was attempted; however, the relationship of D% to item-total 
correlation was so high that the two approaches yielded 75 
items each with 67 items common to both tests. 
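
The D% index and the overlap count can be illustrated as follows. This is a sketch only: it assumes a list of year levels parallel to the response matrix, with Year I coded 1 and Year VI coded 6, and the function names are hypothetical.

```python
def d_percent(responses, years, low_year=1, high_year=6):
    """D% per item: percent correct in high_year minus percent correct in low_year.

    responses: one list of 0/1 item scores per examinee (assumed layout).
    years: year level of each examinee, in the same order as responses.
    """
    def pct_correct(year, j):
        rows = [row for row, y in zip(responses, years) if y == year]
        return 100.0 * sum(row[j] for row in rows) / len(rows)

    n_items = len(responses[0])
    return [pct_correct(high_year, j) - pct_correct(low_year, j)
            for j in range(n_items)]


def top_k(values, k):
    """Indices of the k largest values, returned in ascending index order."""
    ranked = sorted(range(len(values)), key=lambda j: values[j], reverse=True)
    return sorted(ranked[:k])


def overlap(items_a, items_b):
    """Number of items common to two selections."""
    return len(set(items_a) & set(items_b))
```

Selecting top_k(d_percent(responses, years), 75) and comparing it with the 75 items chosen by item-total correlation through overlap() yields the kind of count of common items reported above.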